Neural Programming and an Internal Reinforcement Policy
Abstract
An important reason for the continued popularity of Artificial Neural Networks (ANNs) in the machine learning community is that the gradient-descent backpropagation procedure gives ANNs a locally optimal procedure for changing the network and, in addition, a framework for understanding their learning performance. Genetic programming (GP) is also a successful evolutionary learning technique that provides powerful parameterized primitive constructs. Unlike ANNs, though, GP does not have such a principled procedure for changing parts of the learned system based on its current performance. This paper introduces Neural Programming, a connectionist representation for evolving programs that maintains the benefits of GP. The connectionist model of Neural Programming allows for a regression credit-blame procedure in an evolutionary learning system. We describe a general method for an informed feedback mechanism for Neural Programming, Internal Reinforcement. We introduce an Internal Reinforcement procedure and demonstrate its use through an illustrative experiment.
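To make the idea concrete, here is a minimal sketch of the kind of system the abstract describes: a program represented as a graph of primitive nodes, an evaluation pass, a per-node credit-blame score, and a mutation step biased toward the most-blamed node. The primitives, the credit heuristic (how well each node's output tracks the target), and all names are illustrative assumptions for exposition, not the paper's actual Internal Reinforcement procedure.

```python
# Illustrative sketch only: a Neural Programming-style individual as a graph of
# primitive nodes, with a toy credit-blame pass biasing mutation. The credit
# heuristic below is an assumption made for illustration.
import random

PRIMITIVES = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "max": lambda a, b: max(a, b),
}

class Node:
    def __init__(self, op, inputs):
        self.op = op          # name of the primitive this node applies
        self.inputs = inputs  # indices of upstream nodes, or "x" for the program input
        self.credit = 0.0     # credit-blame score filled in after evaluation

def evaluate(program, x):
    """Run the graph on one input; node i may only read nodes < i or the input x."""
    values = []
    for node in program:
        args = [x if j == "x" else values[j] for j in node.inputs]
        values.append(PRIMITIVES[node.op](*args))
    return values  # the last node's value is the program output

def mean_abs_error(a, b):
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)

def assign_credit(program, data):
    """Toy credit assignment: nodes whose outputs track the target get high credit."""
    traces = [evaluate(program, x) for x, _ in data]
    targets = [y for _, y in data]
    for i, node in enumerate(program):
        outs = [t[i] for t in traces]
        node.credit = -mean_abs_error(outs, targets)  # higher credit = less blame

def mutate(program):
    """Bias mutation toward the most-blamed (lowest-credit) node."""
    worst = min(range(len(program)), key=lambda i: program[i].credit)
    program[worst].op = random.choice(list(PRIMITIVES))

# Usage on a toy regression task: learn y = 2*x.
data = [(x, 2.0 * x) for x in range(1, 6)]
prog = [Node("add", ["x", "x"]), Node("mul", [0, 0]), Node("sub", [1, 0])]
assign_credit(prog, data)
mutate(prog)
```

In an evolutionary loop this credit signal would be recomputed after each evaluation, so structural change concentrates on the parts of the program that contribute least to performance.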
Similar resources
Extracting Dynamics Matrix of Alignment Process for a Gimbaled Inertial Navigation System Using Heuristic Dynamic Programming Method
In this paper, with the aim of estimating the internal dynamics matrix of a gimbaled Inertial Navigation System (as a discrete linear system), the discrete-time Hamilton-Jacobi-Bellman (HJB) equation for optimal control is derived. A Heuristic Dynamic Programming (HDP) algorithm for solving this equation is presented, along with a neural network approximation for the cost function and control input ...
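For the discrete linear case the snippet refers to, the discrete-time Bellman equation reduces to the familiar Riccati recursion, which is the quantity an HDP-style critic approximates. The sketch below shows that recursion directly; the matrices are made-up placeholders and this is not the paper's HDP algorithm.

```python
# Minimal sketch: discrete-time Bellman (Riccati) recursion for a linear system
# x_{k+1} = A x_k + B u_k with quadratic cost x'Qx + u'Ru. The matrices are
# illustrative assumptions, not taken from the paper.
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])  # assumed system dynamics
B = np.array([[0.0], [0.1]])            # assumed input matrix
Q = np.eye(2)                           # state cost weight
R = np.array([[1.0]])                   # control cost weight

P = np.zeros((2, 2))                    # cost-to-go V(x) = x' P x
for _ in range(500):                    # value iteration until P converges
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # greedy feedback gain
    P = Q + A.T @ P @ A - A.T @ P @ B @ K               # Bellman backup

print("cost-to-go matrix P:\n", P)
print("optimal feedback gain K:\n", K)
```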
Efficient Learning Through Evolution: Neural Programming and Internal Reinforcement
Genetic programming (GP) can learn complex concepts by searching for the target concept through evolution of a population of candidate hypothesis programs. However, unlike some learning techniques, such as Artificial Neural Networks (ANNs), GP does not have a principled procedure for changing parts of a learned structure based on that structure's performance on the training data. GP is missing a c...
Adaptive Critic Based Adaptation of A Fuzzy Policy Manager for A Logistic System
We show that a reinforcement learning method, adaptive critic based approximate dynamic programming, can be used to create fuzzy policy managers for adaptive control of a logistic system. Two different architectures are used for the policy manager: a feedforward neural network and a fuzzy rule base. For both architectures, policy managers are trained that outperform LP and GA derived fixed po...
Optimal Asset Allocation using Adaptive Dynamic Programming
In recent years, the interest of investors has shifted to computerized asset allocation (portfolio management) to exploit the growing dynamics of the capital markets. In this paper, asset allocation is formalized as a Markovian Decision Problem which can be optimized by applying dynamic programming or reinforcement learning based algorithms. Using an artificial exchange rate, the asset allocati...
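The snippet above frames asset allocation as a Markovian Decision Problem; the following toy sketch shows one standard way such a formulation can be optimized with tabular Q-learning. The two-asset setting, the artificial price process, and the reward definition are illustrative assumptions, not the paper's experimental setup.

```python
# Toy sketch: asset allocation as an MDP solved with tabular Q-learning on an
# artificial exchange-rate process. All quantities are illustrative assumptions.
import random

ACTIONS = [0.0, 0.5, 1.0]        # fraction of wealth held in the risky asset
STATES = ["down", "flat", "up"]  # discretized previous price move

def artificial_return(state):
    """Synthetic exchange-rate return with mild momentum."""
    drift = {"down": -0.01, "flat": 0.0, "up": 0.01}[state]
    return drift + random.gauss(0.0, 0.02)

def next_state(r):
    return "up" if r > 0.005 else "down" if r < -0.005 else "flat"

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, eps = 0.1, 0.95, 0.1

state = "flat"
for _ in range(50_000):
    # epsilon-greedy action selection
    if random.random() < eps:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    r = artificial_return(state)
    reward = action * r                  # approximate change in log wealth
    s2 = next_state(r)
    best_next = max(Q[(s2, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = s2

for s in STATES:
    print(s, "-> allocate", max(ACTIONS, key=lambda a: Q[(s, a)]))
```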
Learning to Plan Probabilistically from Neural Networks
This paper discusses the learning of probabilistic planning without a priori domain-specific knowledge. Different from existing reinforcement learning algorithms that generate only reactive policies and existing probabilistic planning algorithms that require a substantial amount of a priori knowledge in order to plan, we devise a two-stage bottom-up learning-to-plan process, in which first rei...
Publication date: 1996